The meme is a table comparing the performance of three AI models (LLaMA 2 70B, GPT-3.5, and Mixtral 8x7B) across several benchmarks. Each row lists a benchmark name followed by the three models' scores. No single model leads on every benchmark: on HellaSwag, for example, LLaMA 2 70B scores highest at 87.1%, followed by Mixtral 8x7B at 86.7% and GPT-3.5 at 85.5%, while Mixtral 8x7B leads on MBPP (60.7%) and GSM-8K (58.4%). The table is captioned "AI's performance on a task".
2023-12-18T06:49:32+00:00
Benchmark                        LLaMA 2 70B   GPT-3.5   Mixtral 8x7B
MMLU                             69.9%         70.0%     70.6%
HellaSwag                        87.1%         85.5%     86.7%
ARC Challenge (25-shot)          85.1%         85.2%     85.8%
WinoGrande                       83.2%         81.6%     81.2%
MBPP (pass@1)                    49.8%         52.2%     60.7%
GSM-8K (5-shot)                  53.6%         57.1%     58.4%
MT Bench (for Instruct Models)   6.86          8.32      8.30